Scalable Adaptation of State Complexity for Nonparametric Hidden Markov Models
Authors
Abstract
Bayesian nonparametric hidden Markov models are typically learned via fixed truncations of the infinite state space or local Monte Carlo proposals that make small changes to the state space. We develop an inference algorithm for the sticky hierarchical Dirichlet process hidden Markov model that scales to big datasets by processing a few sequences at a time yet allows rapid adaptation of the state space cardinality. Unlike previous point-estimate methods, our novel variational bound penalizes redundant or irrelevant states and thus enables optimization of the state space. Our birth proposals use observed data statistics to create useful new states that escape local optima. Merge and delete proposals remove ineffective states to yield simpler models with more affordable future computations. Experiments on speaker diarization, motion capture, and epigenetic chromatin datasets discover models that are more compact, more interpretable, and better aligned to ground truth segmentations than competitors. We have released an open-source Python implementation which can parallelize local inference steps across sequences.
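As a toy illustration of the core idea, merge proposals can be scored by an objective that rewards data fit but penalizes the number of active states, so that collapsing redundant states improves the score. The sketch below is not the paper's memoized variational bound; it uses a BIC-style penalty with per-state Gaussian means, and all names and values are illustrative.

```python
import math

# Toy sketch of merge scoring (not the paper's variational bound): score a
# state assignment by per-state Gaussian log-likelihood minus a BIC-style
# penalty on the number of used states, so merging redundant states wins.

def gaussian_loglik(xs, mu, sigma=1.0):
    # Log density of xs under N(mu, sigma^2), summed over points.
    return sum(-0.5 * math.log(2 * math.pi * sigma ** 2)
               - (x - mu) ** 2 / (2 * sigma ** 2) for x in xs)

def penalized_score(data, assignments):
    # Fit each used state's mean, then penalize the model size.
    score = 0.0
    states = set(assignments)
    for s in states:
        xs = [x for x, a in zip(data, assignments) if a == s]
        score += gaussian_loglik(xs, sum(xs) / len(xs))
    return score - 0.5 * len(states) * math.log(len(data))

# Two well-separated clusters; the initial labeling splits the second
# cluster across two redundant states (1 and 2).
data = [0.0, 0.1, -0.1, 5.0, 5.1, 4.9]
before = [0, 0, 0, 1, 2, 1]
after = [0 if a == 0 else 1 for a in before]   # merge state 2 into 1

assert penalized_score(data, after) > penalized_score(data, before)
```

A merge move in this toy setting is accepted exactly when the simpler model's penalized score is higher, mirroring how a bound that penalizes redundant states makes state-space optimization possible.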
Similar Papers
Material: Scalable Adaptation of State Complexity for Nonparametric Hidden Markov Models (paper published at NIPS 2015)
A Experiment Details
  A.1 Toy Data
  A.2 Speaker Diarization
  A.3 Motion capture dataset
  A.4 Chromatin epigenomic dataset ...
Bayesian time series models and scalable inference
With large and growing datasets and complex models, there is an increasing need for scalable Bayesian inference. We describe two lines of work to address this need. In the first part, we develop new algorithms for inference in hierarchical Bayesian time series models based on the hidden Markov model (HMM), hidden semi-Markov model (HSMM), and their Bayesian nonparametric extensions. The HMM is ...
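For background, the forward recursion is the basic primitive that HMM inference algorithms like those described above build on. The sketch below is a generic textbook recursion, not this paper's method; the two-state parameters are made up for illustration.

```python
# Background sketch: the forward recursion for a discrete HMM.
# The two-state parameters below are illustrative only.

def forward(pi, A, B, obs):
    # alpha[k] holds p(x_1..x_t, z_t = k); returns p(x_1..x_T).
    K = len(pi)
    alpha = [pi[k] * B[k][obs[0]] for k in range(K)]
    for x in obs[1:]:
        alpha = [B[k][x] * sum(alpha[j] * A[j][k] for j in range(K))
                 for k in range(K)]
    return sum(alpha)

pi = [0.5, 0.5]                    # initial state distribution
A = [[0.9, 0.1], [0.1, 0.9]]       # "sticky" transition matrix
B = [[0.8, 0.2], [0.2, 0.8]]       # emission probabilities over {0, 1}
lik = forward(pi, A, B, [0, 0, 1, 1])
assert 0.0 < lik < 1.0
```

Summing this likelihood over all possible observation sequences of a fixed length returns 1, a quick sanity check that the recursion is computing a proper marginal.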
Small-Variance Asymptotics for Hidden Markov Models
Small-variance asymptotics provide an emerging technique for obtaining scalable combinatorial algorithms from rich probabilistic models. We present a small-variance asymptotic analysis of the Hidden Markov Model and its infinite-state Bayesian nonparametric extension. Starting with the standard HMM, we first derive a “hard” inference algorithm analogous to k-means that arises when particular var...
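A minimal sketch of the kind of "hard" objective such an analysis typically yields: squared distances to state means plus a fixed cost per state switch, minimized exactly by Viterbi-style dynamic programming. The means and switching cost below are illustrative choices, not values from the paper.

```python
# Sketch of a "hard" small-variance objective: minimize
#   sum_t (x_t - mu_{z_t})^2 + lam * (number of state switches)
# via Viterbi-style dynamic programming. means and lam are illustrative.

def hard_hmm_path(data, means, lam):
    K = len(means)
    cost = [(data[0] - m) ** 2 for m in means]
    back = []
    for x in data[1:]:
        best = min(range(K), key=lambda k: cost[k])
        new_cost, ptrs = [], []
        for k in range(K):
            if cost[k] <= cost[best] + lam:   # staying beats switching
                c, p = cost[k], k
            else:                             # switch from cheapest state
                c, p = cost[best] + lam, best
            new_cost.append(c + (x - means[k]) ** 2)
            ptrs.append(p)
        cost = new_cost
        back.append(ptrs)
    z = [min(range(K), key=lambda k: cost[k])]
    for ptrs in reversed(back):
        z.append(ptrs[z[-1]])
    return z[::-1]

# Moderate lam segments the sequence; a huge lam forbids switching.
assert hard_hmm_path([0.0, 0.1, 5.0, 5.1], [0.0, 5.0], 1.0) == [0, 0, 1, 1]
```

As the switching cost grows, the optimal path collapses to a single state, which is the k-means-like limit behavior the analysis exploits.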
Consistency of Bayesian nonparametric Hidden Markov Models
We are interested in Bayesian nonparametric Hidden Markov Models. More precisely, we prove the consistency of these models under appropriate conditions on the prior distribution, when the number of states of the Markov chain is finite and known. Our approach is based on exponential forgetting and standard Bayesian consistency techniques.
Minimax Adaptive Estimation of Nonparametric Hidden Markov Models
We consider stationary hidden Markov models with finite state space and nonparametric modeling of the emission distributions. Only very recently was it shown that such models are identifiable. In this paper, we propose a new penalized least-squares estimator for the emission distributions which is statistically optimal and practically tractable. We prove a non-asymptotic oracle ineq...